PREFACE


This work was directed by the Performance Analysis Group of CAMS Operations
in support of Accuracy Assessment (SF4). Many CAMS analysts took part in the
initial assessment of the Phase III blind site segment estimates. These
assessments were then reviewed by the Performance Analysis Group for
consistency of the data. The following personnel participated in the study:
1. INTRODUCTION

This report presents the analysis of labeling errors in the final Phase III 
estimate by Classification and Mensuration Subsystem (CAMS) Operations during 
the Large Area Crop Inventory Experiment (LACIE) from a subset of blind sites 
in five U.S. Great Plains (U.S.G.P.) states: North Dakota, Oklahoma, Montana,
Colorado, and Minnesota. 


1.1 OBJECTIVES

The objectives of the performance analysis using the blind site data were to

a. Identify the causes of labeling error and the factors involved in either 
overestimation or underestimation of the small-grain acreage and to 
provide data for a more detailed study. 

b. Quantify the labeling error of the dots used for the final classification 
estimate by CAMS Operations. 

c. Summarize and report the results of these evaluations. 

d. Transmit to CAMS Operations recommendations on labeling procedures.

1.2 SCOPE 

Because of manpower and time limitations and some lack of adequate
ground-truth data, not all the U.S.G.P. states were included in the study. Of
those states used, only a portion of the total blind site segments were
evaluated for the same reasons. The five states studied and the number of
segments used are as follows.


State           No. of segments used    No. of usable blind sites in the state

North Dakota             18                         21
Oklahoma                 11                         15
Montana                  10                         23
Minnesota                 6                         12
Colorado                  6                         11

The states and segments were selected by Accuracy Assessment (AA) personnel. 
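
As a quick check on the table above, the share of usable blind sites that was
actually evaluated can be computed per state. The sketch below is illustrative
only: the counts are copied from the table, while the names SEGMENTS and
coverage are hypothetical and not part of the study.

```python
# Counts copied from the scope table: (segments used, usable blind sites).
SEGMENTS = {
    "North Dakota": (18, 21),
    "Oklahoma": (11, 15),
    "Montana": (10, 23),
    "Minnesota": (6, 12),
    "Colorado": (6, 11),
}


def coverage(used, usable):
    """Return the percentage of usable blind sites that were evaluated."""
    return 100.0 * used / usable


total_used = sum(used for used, _ in SEGMENTS.values())
total_usable = sum(usable for _, usable in SEGMENTS.values())

for state, (used, usable) in SEGMENTS.items():
    print(f"{state:<13}{used:>3} of {usable:>2}  ({coverage(used, usable):.0f}%)")
print(f"{'All states':<13}{total_used:>3} of {total_usable:>2}  "
      f"({coverage(total_used, total_usable):.0f}%)")
```

Overall, 51 of the 82 usable blind sites in the five states were evaluated.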

The ground-truth data consisted of large-scale photographs and overlays with
the crop type indicated by field personnel of the U.S. Department of
Agriculture and a digital computer printout, provided by AA personnel, of the
ground truth in a matrix format of 209 coded numbers identifying the crop for
each field.

The blind site ground-truth data are collected late in the growing season,
thus permitting only the final season estimate to be used. Therefore, the
results of this study are relative only to the final estimate passed to
the Crop Assessment Subsystem (CAS). No mid-season data were used unless it
was the last classification estimate.
2. BACKGROUND


One of the major sources of underestimation in the LACIE proportion estimates
has been found to be the misidentification of small grains' signatures.
However, this statistical value alone provides no insight on how to reduce
this error source. The first step to a solution is to identify and quantify
the reason for the mislabeling.

During the latter part of the 1976-77 growing season (Phase III), CAMS 
Operations personnel used the Procedure 1 dot processing technique (ref. 1) 
for estimating the acreage of small grains. The accuracy of the proportion 
estimate of small to nonsmall grains is normally compared to the actual
proportion of small to nonsmall grains derived from the ground truth. 

The proportion estimate represented the gross effect of all errors from
all sources. Analyst labeling could also be quantified, but it was not
specific enough to identify the causes of the individual dot labeling errors.
Thus, a supplemental method that had been developed for the intensive test
site (ITS) segments was used on the blind site segments; namely, labeling
error characterization. This technique attempts to separate factors used in
individual dot interpretation/labeling and relates labeling errors and causes
to each other.

The CAMS analyst estimated the wheat acreage of a segment by image
interpretation of production film converter (PFC) products as described in
reference 2 and guided by the techniques of Procedure 1. This method of wheat
acreage estimation is basically that of interpreting and labeling the upper
left dot of a subset of the 209 grid intersections on the PFC products and
using the spectral values of the labeled dots as the basis for the LACIE
computer program to estimate the proportion of wheat in the segment.

The labeling of the dots and the cause of the mislabeling are vital to the 
proportion estimate. 
Procedure 1 required each analyst to use the same decision logic or deductive
reasoning to interpret the imagery. The method of interpretation is basically
a comparison of the fields' colors (spectral signatures) to each other
throughout the growing season as manifested in the PFC imagery, primarily
Product 1. Products 2 and 3 were used basically as ancillary data for the
decisionmaking process.

Analysts tended to be conservative when interpreting imagery. To label a
small-grain field or dot, the analyst had to have spectral and spatial
evidence of small grain. This not only involved the dot that was to be labeled
but also other dots on the imagery that were both similar and dissimilar
to it. It was important for the analyst to follow the progression of all
the signatures of all types through time (multiple acquisitions) and compare
the progression with the expected phenological development of the small grain.
If evidence suggested that the signature was that of a small grain, the
picture element (pixel) was labeled small grain; otherwise, it was labeled
nonsmall grain. This conservative rationale for labeling was necessary
because the analyst had to base his judgment on repeatable evidence of
physical conditions that were manifested in the spectral and spatial aspects
of the imagery. Otherwise, the labeling decision would have been inconsistent,
illogical, arbitrary, and less likely to be correct.

For example, in an Oklahoma segment during a drought-affected season when
most of the wheat had turned, a narrow band of wheat, one pixel wide, around
a small lake or pond developed phenologically more slowly than the rest of
the wheat in the same field because of the greater amount of moisture
available. The band of wheat remained a bright red, consistent with the
heading stage, whereas the remaining portion of the wheat field displayed a
dark gray or brown color on the PFC imagery, consistent with the turning
signature. Because weeds, grass, and trees frequently grow adjacent to
standing water in wheat-growing areas of the U.S.G.P., the nonsmall-grain
vegetation would be manifested on the imagery as bright red when wheat is
turning. When faced with this decision, the analyst would label the bright
red band as nonwheat because this is the most frequent occurrence under these
conditions. The analyst could not be expected to know that under this
particular circumstance the red band was truly wheat and not grass or weeds.







During the 1977 harvest season in Oklahoma, a segment had acquisitions
representing only the planting-to-early-emergence stage, a dormant stage, and
the last acquisition well into the turning/ripening stage. The imagery
showed a poorly emerged small-grain signature in the first stage; the
dormant stage was not helpful. The final stage of the small-grain signature
was so like the nonsmall-grain signature that the analyst missed a significant
amount of small grain in the segment. Since he had no signature evidence of
small grain in most of the small-grain fields, the analyst had to turn in a
low estimate even though he probably surmised this area to be a high
small-grain production area. He could justify the low estimate on the basis
that numerous reports of drought were received for this area and that the low
estimate would be consistent with that episodic event.

The conservative approach does bias the labeling toward underestimation of
small grain. Under the circumstances, the analysts must continue in this
manner until some yet unknown reliable information can be made available
for interpretation.
3. APPROACH


In the search for a better definition of the reasons and/or causes of
erroneously labeled training fields or dots for Phase III ITS evaluations, an
attempt was made to separate each repeatable facet of the image interpretation
thought process and growth stages of small grains and then to tabulate the
results per segment consistently. An effort was made to identify the various
causes by separating the errors into separate spatial conditions (ref. 3).

This study was useful in determining the influence of boundary pixels on the
interpretation. However, the physical and interpretative conditions under
which the pixels were labeled were not part of the statistical analysis. Such
conditions that were not considered were

• Enumeration of the growth stages represented by the acquisitions available.

• Comparison of the majority of the wheat signatures' development to the
expected normal wheat signature of the adjusted crop calendar.

• Various interpretative confusions.

This report expands upon the original concept of the labeling error
characterization and hopefully improves the identification of the error causes
and their relationships to each other.

The rationale of the labeling error characterization is to identify and
tabulate the following:

• Each normal physical condition of the growth stages that could be reflected
or deduced from single or temporal image interpretation of the imagery.

• The "normal range" of the temporal spectral colors for each condition of
the growth stages, for comparison of the abnormal colors in the imagery.

• The manifestations of the PFC imagery's spectral response to episodic
events.

• The spectral capabilities of the acquisitions available and missing
acquisitions that have influenced the interpretation/labeling.

• The various types of causes of labeling errors and their relationship to
each other.

With this comparison of the normal to the abnormal data to identify errors,
each error can then be tabulated and easily related to other error factors
both logically and systematically.

Statistical analysis can then be applied to the relationship of the rate of
error between various combinations of factors. Synthesis of these results
can provide data to enable project management to attack the larger sources
of error first and direct remedial action toward reducing the labeling error
with the most efficient use of manpower and financial resources.



4. DESCRIPTION OF THE LABELING ERROR CHARACTERIZATION FORMAT

The labeling error characterization format evolved after many modifications
of the data recorded over several months. The format included the description
of various categories, rearrangement of the tabulation format, and grouping
of similar factors and splitting of dissimilar or important factors.

The correct base acquisition was determined by examining the analyst's dot
labeling form. The base acquisition used was the PFC acquisition selected
for labeling the dots and registering all acquisitions for this temporal
classification. All the analyst's dot labels were carefully recorded
separately for both Type 1 and 2 dots on the dot comparison form (fig. 1) in a
matrix of boxes in the format of the 209 dot intersections on the PFC imagery.

4.1 DOT COMPARISON FORM 

The ground-truth identity of each dot was carefully recorded in the lower
half of the box using the digital ground-truth information supplied by
AA personnel. The computer-generated ground-truth data formed the basis of
the unbiased assessment of the dot labeling error. The number of correctly
and incorrectly labeled dots of the small and nonsmall grains is recorded on
the right side, line by line. The number of true border/edge boundary pixels
is recorded on the left side in the appropriate small-grain or nonsmall-grain
category based on the ground-truth printout. The count of boundary pixels
recorded is the total number of boundary pixels (border/edge), whether or not
the pixels were properly labeled. These are totaled at the bottom left side.

The total number of strip/fallow fields indicated by the ground-truth printout
is recorded at the bottom center of the dot comparison form. The total
number of labeled strip/fallow fields that had an integrated spectral
signature was also listed. An integrated signature was a combination of two
different spectral signatures of small fields that, because of the limited
resolution of the Land Satellite (Landsat) sensor, were averaged spectrally
to a value somewhere between the two signatures. The number of integrated
strip/fallow fields (dots) that were labeled as nonsmall grains is also
recorded at the bottom. To determine whether field signatures may be
integrated or not, the evaluator assessed whether the strip/fallow fields were
large enough or too small to be manifested on the PFC imagery as individual
fields. If the fields were large enough, the analyst was expected to be able
to label them correctly; therefore, the labels might or might not be in error.
If the fields were too small to be separated spectrally and spatially, they
were counted as an integrated signature, and the analyst's label was
considered correct regardless of the difference with the ground-truth
printout.

Figure 1


After all errors, boundary pixels, and strip/fallow fields were totaled, all
areas of designated other (DO) delineated by the analyst were checked for any
inclusion of small-grain labels, which would be automatic errors of omission
and were recorded as such. The remaining dot labels, which show an agreement
between the analyst and the digital ground truth, were checked for accuracy
against the ground-truth photograph and overlay by careful manual comparison.
If any labels were found in error, they were indicated on the dot comparison
form and recorded on the segment tabulation sheet as double disagreement (DD)
(fig. 2). The totals at the bottom right of the dot comparison form are the
results of the labeling errors according to the AA digital ground truth. The
numbers were the sums of the total nonsmall- and small-grain pixels labeled,
followed by the total number of errors of nonsmall and small grains of those
labeled. Included in the small-grain error was the number of small-grain dots
excluded from classification by the DO area. (DO areas exclude all small
grains, by definition.) The double disagreement errors were added only on the
segment tabulation sheet.
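
The tallying done on the dot comparison form can be sketched in code. This is
an illustration under stated assumptions, not the software used in the study:
the label codes "S" and "N", the function name tally_dot_errors, and the
representation of a DO area as a set of dot numbers are all hypothetical. The
sketch counts the labeled dots, compares each label with the digital ground
truth, and treats any ground-truth small-grain dot inside a DO area as an
error of omission.

```python
def tally_dot_errors(analyst, truth, do_dots=frozenset()):
    """Tally labeling errors for one segment.

    analyst, truth: dicts mapping dot number -> "S" (small grain) or
        "N" (nonsmall grain). These codes are assumed for illustration.
    do_dots: dot numbers that fall inside analyst-delineated DO areas.
    """
    counts = {"labeled": 0, "omission": 0, "commission": 0}
    for dot, label in analyst.items():
        if dot in do_dots:
            continue  # DO dots are handled separately below
        counts["labeled"] += 1
        if truth[dot] == "S" and label == "N":
            counts["omission"] += 1    # small grain labeled nonsmall grain
        elif truth[dot] == "N" and label == "S":
            counts["commission"] += 1  # nonsmall grain labeled small grain
    # A DO area excludes all small grains by definition, so a ground-truth
    # small-grain dot inside one is an automatic error of omission.
    counts["omission"] += sum(1 for dot in do_dots if truth.get(dot) == "S")
    return counts


# Hypothetical five-dot example; dot 5 lies inside a DO area.
analyst_labels = {1: "S", 2: "N", 3: "S", 4: "N"}
ground_truth = {1: "S", 2: "S", 3: "N", 4: "N", 5: "S"}
print(tally_dot_errors(analyst_labels, ground_truth, do_dots={5}))
# {'labeled': 4, 'omission': 2, 'commission': 1}
```

Double disagreements are not included here because, as noted above, they were
added only on the segment tabulation sheet.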

4.2 SEGMENT TABULATION SHEET 

The errors were listed on the segment tabulation sheet by each dot's discrete
number according to its location on the matrix of 209 dots (fig. 3). This
matrix of Type 2 dots is registered to the dot comparison form for convenience
of identifying the pixel number. Only the Type 2 dots are explained here.
(The Type 1 dots were evaluated in the same manner.)


Figure 3.- Matrix of Type 2 dots.


On the segment tabulation sheet, the signature for each dot was evaluated
individually for its representation of the growth stages in available
acquisitions, confusion crops or conditions, and the apparent cause of the
error. In addition, the labeling error evaluation provided a list, under the
disagreement category, of those pixels that were not in error but would
appear so because of registration constraints in the computer program and,
under the double disagreement category, of those that were in error about
which the analyst and the computer program were in agreement.

4.3 STATE TABULATION SHEET 

The results of the tabulation of errors for each segment were recorded on the 
state tabulation sheet (fig. 4). The raw data of this study are presented 
on the state tabulation sheets for each state in the appendix. 

The state tabulation sheet records the error causes by segment (vertically) 
and the causes by error group (horizontally) in part A. The total number of 
pixels per cause, separated into omission and commission, are recorded along
the right-hand margin with the applicable percentages adjacent to them. The 
total number of pixels on each line represents the total error of their 
category of either omission or commission. The sums of the number of pixels 
labeled per category are recorded in part B of the form. 

The numbers of the basic data group (part B) represent the total number of
pixels labeled, separated into omission and commission and summed as total
pixels labeled. The digital matrix totals are the omission and commission
errors found by the computer's comparison of its digital ground truth with
the analyst's labels. The labeling error characterization evaluation totals
are the omission and commission error totals per segment, as recorded under
each error type on the state tabulation sheet (part A). These totals reflect
the adjustments for the errors of disagreement and double disagreement.
(a) Part A.
Figure 4.- State tabulation sheet.

(b) Part A - Concluded.
Figure 4.- Continued.
The category of disagreement is not an error in the labeling per se, but
rather a record of the differences between the computer registration of the
imagery to the ground-truth photograph and the comparison of the two by the
labeling error evaluation. These differences result from misregistration
or mislabeling. Most of the differences were caused by the computer's
misregistration of a pixel category near a boundary with another field of a
different category. The computer subdivides each pixel into six subpixels.
Applying the "rule of majority" to the spectral value of each subpixel forces
the computer to decide in favor of the majority. However, these dots clearly
show, by comparison of the PFC image to the ground-truth large-scale aerial
photograph and overlay, that the spatial and spectral properties belong to
the other category. These disagreement pixels were assigned the kappa symbol
(κ) and recorded on the segment and state tabulation sheets.
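
The effect of the rule of majority on a boundary pixel can be illustrated
with a short sketch. The text states only that each pixel is subdivided into
six subpixels and that the computer decides in favor of the majority; the
category codes, the function name majority_label, and the handling of ties
below are assumptions made for illustration.

```python
from collections import Counter


def majority_label(subpixels):
    """Return the category held by most of a pixel's six subpixels.

    Returns None on a tie. How ties were actually resolved is not stated
    in the text, so this behavior is an assumption.
    """
    if len(subpixels) != 6:
        raise ValueError("expected six subpixels per pixel")
    ranked = Counter(subpixels).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None
    return ranked[0][0]


# A boundary pixel: four subpixels fall in a nonsmall-grain field ("N") and
# two in a small-grain field ("S"). The majority rule assigns "N" even if
# the imagery suggests the dot belongs with the small-grain field.
print(majority_label(["N", "N", "N", "N", "S", "S"]))  # N
```

A pixel of this kind is a candidate for the disagreement category when the
photograph and overlay show that it belongs to the minority category.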

The second cause of disagreement, personnel's mislabeling of the field on the
overlay, seldom occurs. This error was detected through a careful comparison
of the temporal signatures of the PFC imagery. This disagreement is recorded
in the sigma category (σ) on the segment tabulation sheet.

The double disagreement values are the additional pixels about which the
analyst and the digital ground truth agree. However, during the labeling
error evaluation, evidence showed that the pixel was of another category.
Double disagreements (DD) were recorded at the bottom of the state tabulation
sheet.

In all comparisons of the ground truth to the imagery, the ground truth was
considered correct until proven differently. The disagreement values added 
to the labeling error evaluation totals match the ground-truth matrix totals. 

The total number of border/edge pixels, regardless of error, is recorded on
the designated line of the state tabulation sheet. The data provide the
basis for the percentage of boundary pixel errors of the total pixels labeled
that occurred for each segment and the average occurrence of border/edge
pixels for each state.
5. CAUSE CATEGORIES AND THEIR USE


To evaluate the labeling accuracies correctly, the conditions under which the
CAMS analyst worked should be recreated as closely as possible. This approach
requires consideration of the data available to the analyst and of the method
of operation required by Procedure 1.


5.1 AVAILABLE ACQUISITIONS 

All of the acquisitions that were available in the segment packet to the
analyst at the time of the labeling for the classification estimate are to
be considered, even those acquisitions that are not used for processing.
Although some acquisitions are not used for the estimate, the spectral
condition of these acquisitions still influences the labeling decision. Even
those with clouds and some snow cover contributed value toward the
interpretation and labeling. Those acquisitions that were placed in the
segment packet after the analyst's estimate were not used for the labeling
error characterization evaluation because they were not available to the
analyst for the classification.


After determining the acquisitions available for the estimate from the segment
packet data, the labeling error characterization evaluator placed the
acquisitions on a light table and assigned a growth stage symbol to each
acquisition, represented by a lowercase letter, as indicated below.


Symbol  Growth stage                  Normal expected color

a       Planting through emergence    Gray, black, generally black
b       Postplanting, postemergence   Less dark, brighter soil type
                                      signature as it dries
c       Postemergence, jointing       Pinking up
d       Dormancy                      Pink to dark gray or green
e       Jointing through heading      Pink to red
f       Turning, ripening             Mottled red, yellow, olive, and
                                      grayish green
g       Harvest                       White, green
h       Postharvest                   Pinking up, dark green-brown

The colors or shades were used as a guide or general description to convey
the tone of the acquisition's colors and are by no means the complete list
of shades and colors for each stage. The interpreter expects to see some
variations of shade for the same crop, even between fields that are in the
same growth stage.

A growth stage was assigned to each available acquisition according to the
stage indicated by the small-grain signature of the majority of the
small-grain fields. Each growth stage was recorded only once on the segment
tabulation sheet even though there may have been more than one acquisition
for a particular stage. Under multiple acquisition conditions for a growth
stage, all the acquisitions applicable to a single growth stage were averaged
by the evaluator.
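
The bookkeeping described above, one growth stage symbol per acquisition with
each stage recorded only once per segment, can be sketched as follows. The
stage symbols are those listed in this section; the acquisition identifiers
and the function name stages_for_segment are hypothetical.

```python
# Growth stage symbols as listed in this section.
GROWTH_STAGES = {
    "a": "Planting through emergence",
    "b": "Postplanting, postemergence",
    "c": "Postemergence, jointing",
    "d": "Dormancy",
    "e": "Jointing through heading",
    "f": "Turning, ripening",
    "g": "Harvest",
    "h": "Postharvest",
}


def stages_for_segment(acquisition_stages):
    """Group acquisitions by growth stage so each stage appears once.

    acquisition_stages: list of (acquisition_id, stage_symbol) pairs in
    acquisition order. Returns a dict mapping each stage symbol to the
    acquisitions the evaluator would consider together for that stage.
    """
    grouped = {}
    for acq_id, symbol in acquisition_stages:
        if symbol not in GROWTH_STAGES:
            raise ValueError(f"unknown growth stage symbol: {symbol!r}")
        grouped.setdefault(symbol, []).append(acq_id)
    return grouped


# Hypothetical segment with two acquisitions in the turning/ripening stage.
segment = [("acq1", "a"), ("acq2", "d"), ("acq3", "f"), ("acq4", "f")]
print(stages_for_segment(segment))
# {'a': ['acq1'], 'd': ['acq2'], 'f': ['acq3', 'acq4']}
```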


5.2 ERROR ASSESSMENT OF INDIVIDUAL PIXELS 


Each error pixel was listed on the segment tabulation sheet in numerical
order from the dot comparison form along with its type (1 or 2). Each dot
type was assessed separately by group. On figure 2, the Type 1 dots are not
listed to avoid redundancy of explanation.


The latest acquisition available for classification was the sole acquisition
upon which the judgment was made for the determination of the adjusted crop
calendar (ACC). The majority of the signatures for small-grain fields of that
last available acquisition determined the designation of the ACC. As indicated
before, some small-grain field signatures may be either ahead or behind the
ACC on the particular acquisition. A comparison was made between the numerical
value of the ACC, as scribed on the PFC image by the analyst, and the spectral
signature of the majority of the small-grain fields. However, the spectral
signature was allowed a range of color that would be reasonable for the
scribed ACC value. The latest acquisition's signature was then assessed to
be either in agreement with, behind, or ahead of the ACC. This decision was
then applied to all the error pixels in the manner described in the condition
category on the segment tabulation sheet.









5.3 CATEGORIES OF ERROR CAUSES


5.3.1 CONDITION 

If the error pixel was labeled small grain in the digital ground-truth
printout, the condition is either

• 1 = in agreement with the ACC.

• 2 = behind the ACC.

• 3 = ahead of the ACC.

If the error pixel was labeled as nonsmall grain in the digital ground-truth
printout, the condition is either

• 4 = in agreement with the ACC.

• 5 = behind the ACC.

• 6 = ahead of the ACC.
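
The six condition codes amount to a lookup on two facts: the ground-truth
category of the error pixel and the assessment of the latest acquisition
relative to the ACC. A minimal sketch follows; the string values and the
function name condition_code are hypothetical.

```python
# Condition codes keyed by (ground-truth category, relation to the ACC).
CONDITION_CODES = {
    ("small grain", "agreement"): 1,
    ("small grain", "behind"): 2,
    ("small grain", "ahead"): 3,
    ("nonsmall grain", "agreement"): 4,
    ("nonsmall grain", "behind"): 5,
    ("nonsmall grain", "ahead"): 6,
}


def condition_code(truth_category, acc_relation):
    """Return the condition code (1 to 6) for an error pixel."""
    try:
        return CONDITION_CODES[(truth_category, acc_relation)]
    except KeyError:
        raise ValueError(
            f"no condition for {truth_category!r} / {acc_relation!r}"
        ) from None


print(condition_code("small grain", "behind"))     # 2
print(condition_code("nonsmall grain", "ahead"))   # 6
```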

5.3.2 CONFUSION VEGETATION 

This category indicates the crop or vegetation with which the pixel's (field)
spectral signature was confused. The list below explains the meaning of the
symbols. Those confusion crops of the "other" category were written on the
right-hand side of the segment tabulation sheet.

1.0 Winter wheat labeled other: Confusion crop cannot be determined. 

1.1 Winter wheat confused with spring grains 

1.2 Confused with hay or alfalfa 

2.0 Nonwheat labeled wheat: Confusion crop cannot be determined. 

2.1 Confused with spring grains 

2.2 Confused with other small grains 

2.3 Confused with winter grains 

3.0 Spring wheat labeled other: Confusion crop cannot be determined. 

3.1 Spring wheat confused with winter grains 

3.2 Confused with hay or alfalfa 
5.3.3 ACQUISITIONS AVAILABLE

Lowercase letters were recorded at the top of the column labeled "acquisitions
available" to indicate the growth stages represented by acquisitions. The
letters correspond to the growth stages listed in section 5.1. The behind (<)
or ahead (>) symbol over a letter indicates that the spectral response for
that growth stage, manifested by the spectral response of the majority of the
small-grain fields, was either behind or ahead of the ACC. If no symbol is
written over the letter, the growth stage was in agreement with the ACC.

For expediency, only the abnormal colors of the error pixel were indicated
under the corresponding growth stage by uppercase letters. The abnormal
colors are listed in the table below. The blank areas for each pixel's
growth stage indicate that the color for that particular growth stage was
within the normal or expected range.

Code    Abnormal color

A       Pink, red
B       Dark gray to black
C       Purple, dark brown, dark gold, etc.
D       Yellow, gold, tan (lighter than C)
E       Whitish pink to gold to yellow
F       Green, blue

5.4 EXPLANATION OF THE ERROR CAUSES 

The various causes of error are listed below with the corresponding 
explanation and symbol. 

α = Insufficient acquisitions. A lack of informative acquisitions (those
useful to the estimation) contributed to the cause of the labeling error.
(Acquisitions that are hazy or cloudy, etc., or more than one acquisition
in the same biostage may be only partially useful.)

β = Poor stand of small grain, usually caused by abnormal weather conditions
or cropping practices. (Reserved for use with 18-day field observations
for specific fields.)

γ = Abnormal development of small grain.

  γ1 = Behind ACC (late planting and development).

  γ2 = Ahead of ACC (early planting and development).

ε = Narrow strip fields. Single narrow fields. The field's signature may or
may not be overridden by surrounding signatures.

λ = Clerical error.

  λ1 = Wrong acquisition used for labeling, which is the base acquisition.
  Analyst simply wrote the wrong acquisition number.

  λ2 = The error pixel clearly followed a temporal sequence for its
  category. Since other pixels with the same temporal sequence were
  consistently identified correctly, this error pixel was most
  likely misidentified.

μ = Double cropping practice of a second crop or weeds which have become the
dominant signature and caused the increase in the infrared response after
harvest.

n = Border/edge pixel, indicating spectral and spatial confusion between two
or more fields of different types.

^ Unknown cause. Error does not apply to any of the known causes. 

χ = Weak small-grain signature. Temporal color sequence is followed, but
colors are subdued.

ω = Field destroyed by grazing, plowing, disking, etc.

θ = Signature of a small grain that does not follow the expected temporal
color sequence of small grain throughout the acquisitions.

ν = Signature of a nonsmall grain that does follow the expected temporal
color sequence of small grain throughout the acquisitions.

τ = Volunteer wheat signature that does follow the temporal color sequence.
Labeling from volunteer wheat was considered an error only after the
availability of an acquisition in which a plowed-up signature occurred.

δ = Small-grain signature confused with nonsmall-grain signature.

n = Nonsmall-grain signature confused with small-grain signature.

κ = Disagreement with AA digital ground truth.

σ = Disagreement with ground-truth map (field) label.

5.5 APPLICATION OF THE ERROR CAUSES 

The determination of the error causes is somewhat subjective, since someone
other than the analyst has ascertained the causes of the errors. Although
the analyst was consulted as to why each error was made (except for errors
with obvious reasons), it was difficult for the analyst to remember the reason
for labeling the pixel as he did. To maintain as much objectivity and
consistency as possible, a second person reviewed each error analysis. It
is believed that the result of the error analysis is reasonable and quite
accurate; the exact accuracy is not known.
A discussion of how each error cause was used follows.

• α - Insufficient acquisitions, which are usually caused by clouds
obscuring the scene at the time of the Landsat overpass. This physical
constraint is an overriding factor in the evaluation of errors. For
example, in Oklahoma, during Phase III, a particular area had a large
amount of abandoned wheat. There were only two acquisitions: one during
early emergence and the other in senescence, or after the small grain
began to ripen (turning). For an analyst to determine that a field was
abandoned, the wheat must be abandoned before senescence with sufficient
time for the Landsat imagery to reflect the change. A reasonable
amount of small-grain fields should be harvested so that a comparison
can be made. Last, an acquisition must be obtained at this stage. In
this Oklahoma segment, the analyst had confusion with the other
harvested small-grain fields and no visible temporal evidence to
prove abandonment. The cause assessed to this type of error could have
been that small grain did not follow the temporal color sequence of small
grain (θ), that small-grain signature is confused with nonsmall-grain
signature (δ), or that the field was destroyed by plowing, grazing, etc.
(ω).

• β - Poor stand of small grain. This cause was determined during the
labeling error evaluation, but re-evaluation suggests that "poor stand"
should be reserved for evaluation in which the specific field of the error
pixel has a record of the 18-day observations to support it. The β poor
stand causes that have been verified (usually on ITS segments) showed the
field to be retarded in growth or behind the ACC. Therefore, for this
final synthesis of five U.S.G.P. states, the errors counted in this category
were included in the (γ1) abnormal development of small grain.

• γ - Abnormal development of small grain (wheat). Both types of causes
(γ1 and γ2, behind and ahead) relate the growth stage of the specific
field that the error pixel represents to the ACC value of the last
acquisition. Regardless of the growth stage of most of the small-grain fields,
this cause was assessed to a particular field. The evaluation of all data
from the five states suggests that the γ1, behind-the-ACC cause, should
include the number of errors from β poor stand and χ weak small-grain
signature as well.

• e - Narrow strip fields. This cause is similar to the border/edge problem
but is partly due to the scanner resolution's inability to differentiate
the small size field, which is an isolated field.


• X - Clerical errors. Clerical errors are of two types:

  • X1 - Wrong acquisition used for labeling. This cause stems from the
    analyst's use of a different acquisition for labeling the pixels than
    that indicated on the CAMS evaluation form as the base acquisition.
    The acquisition indicated was misregistered from the one used for
    labeling.

  • X2 - Inadvertent error. This is used only when a signature has been
    correctly labeled several to many times and then mislabeled once or
    twice, all on one acquisition.


• U - Double cropping practice. There is little difficulty in understanding
this cause or its use.

• Border and edge pixels. A border pixel is the result of confusion in
identification between two different field types. The spectral signature
is similar to both types by integration of the spectral reflectance, and
the location of the pixel is on the border of both fields. An edge pixel
error should not occur for Type 1 dots because of the requirements of
Procedure 1, but it does sometimes. The edge pixel is clearly in one field
or another in several acquisitions. The analyst did not recognize that
the pixel changed location to a different field and thought it was a pure
pixel, when in fact, due to a one-pixel shift in registration between two
acquisitions, the error pixel changed crop type.

• 4 - Unknown cause. Sometimes the evaluator cannot determine reasonable
evidence for the error.

• x - Small-grain signature. This reason for labeling error was used for
the evaluation, but only a few pixels were assigned to it. Review of the
five-state data would suggest that this reason should be grouped together
with the Y1 cause of abnormal small-grain signature since it is almost
the same condition (behind).

• w - Destruction by plowing, grazing, etc. This cause requires the use of
specific field data. It is not often that a specific field is observed
closely enough that the analyst can be sure this type of event occurred.

• 6 - Small-grain signature that does not follow the temporal color sequence.

• u - Nonsmall-grain signature that does follow the temporal color sequence.
Both 6 and u may override the importance of other causes that may also be
true, much like the a causes do, and generally for the same kind of reason.
For instance, an error may also be caused by the fact that it is a poor
stand (b); but if the signature does not follow the expected temporal color
sequence, which is the basis of the image interpretation for small-grain
classification, then the analyst cannot correctly label the pixel.



• T - Volunteer wheat error that can be used only when ground-truth data for
a specific field are available to the evaluator.

• 6 - Small-grain and n nonsmall-grain confusion errors that were used
relatively little. They were used when the confusion occurred and no other
evidence was observed to support a different reason for the mislabeling.
Re-evaluation of these causes suggests that they are too vague and that
their use should be discontinued.
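
The one-pixel registration shift described under border and edge pixels can
be illustrated with a toy example. This is only a sketch under assumed
inputs: the six-pixel strip, the "SG"/"NS" codes, and the function name are
hypothetical and are not taken from the study data.

```python
# Hypothetical 1 x 6 strip of pixels crossing a field boundary.
# "SG" = small grain, "NS" = nonsmall grain (illustrative codes only).
FIELD_MAP = ["SG", "SG", "SG", "NS", "NS", "NS"]


def label_at(column, shift=0):
    """Crop type found at a dot's nominal column when the scene is shifted."""
    return FIELD_MAP[column + shift]


# A dot on the last small-grain pixel before the boundary.
edge_dot = 2
labels = [label_at(edge_dot, shift) for shift in (0, 1)]
print(labels)  # ['SG', 'NS']: a one-pixel shift changes the crop type

# A dot well inside the field is unaffected by the same shift.
print(label_at(0, 1))  # 'SG'
```

Only dots next to a boundary change crop type under the shift, which is why
the text treats the edge pixel as a registration problem rather than an
interpretation problem.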

Disagreement factors were not causes of analyst labeling error but reasons for
the labeling error characterization evaluator to disagree with the digital
ground truth. These pixels were used to raise the labeling accuracy above
that implied by the error rate determined from the digital ground truth.
6. RESULTS

The subset of the segments in Oklahoma and Colorado does not appear to be as
representative of their states as it should be. The results obtained from the
LACIE proportions are somewhat different from those of this labeling study
for Minnesota and Colorado. However, the data set was included in the study
because it was all that was available. The number of segments used in the
study and the number of segments available by state are as follows.


State            No. of segments    No. of segments
                 available          used in study

North Dakota          21                 18
Oklahoma              15                 11
Montana               23                 10
Minnesota             12                  6
Colorado              11                  6


6.1 STRIP/FALLOW FIELDS 

The area extensively covered by strip/fallow fields is usually in the northern
tier of the states. It was believed, prior to the labeling error
characterization, that analysts would tend to label the strip/fallow areas as
"other" crops rather than as small grains. If this were true, it would
contribute to the underestimation of the LACIE proportion estimates.

The labeling error characterization evaluators made a special tabulation to
establish the facts. The labeling errors for strip/fallow were separated
into two groups. The first group consisted of pixels that were identifiable;
the second, of pixels that had an integrated signature in which the separation
of the strip fields could not be distinguished on the PFC imagery. Because
half of the integrated signature strip fields were labeled "other,"
strip/fallow fields did not contribute to the underestimation of the LACIE
proportion estimates.
6.2 INSUFFICIENT ACQUISITIONS

The labeling error characterization evaluation showed that the error rate of
a segment was very high for those CAMS classification estimates for which
acquisitions for two important growth stages were not available. If these
particular segments were used for the aggregation, then the LACIE proportion
estimates would not be representative of the CAMS estimates. Therefore,
better aggregation results could be obtained if "shortchanged" segments were
excluded by CAMS from the aggregation.


Table 1, Comparison of Growth Stage Availability to Labeling Error, shows that
three growth stages are required for the best, consistent labeling:
postemergence (b), jointing through heading (e), and either turning (f) or
harvest (g). It was not possible for this analysis to separate the value of
stage (e) from that of (f) because the analyst interprets by comparing the
more vigorous plant stage of (e) to the less vigorous plant stages of (f)
and (g). The postemergence stage (b) is needed to separate and fix the
beginning of the growth cycle. One might also conclude that stage (a),
planting through emergence, was not important. However, when mixed segments
are involved, the planting date becomes important to separate the spring from
the winter grains. If the available acquisitions had only an (a) stage and an
(f) stage, the analyst would find it difficult to determine the senescence
because he would not be able to compare the vitality of the (f) stage
signature to a vigorous signature. Therefore, an analyst's confusion between
natural vegetation and other crops would very likely occur.
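
The stage requirement described above can be written as a simple check. The
following is a minimal sketch and not part of the original study: the
function names and the use of single-letter strings are assumptions made for
illustration; only the stage letters (a), (b), (e), (f), and (g) come from
the text.

```python
# Illustrative sketch only; function names are hypothetical.
# Stage letters follow the text: (b) postemergence, (e) jointing through
# heading, (f) turning, (g) harvest.

def has_required_stages(stages):
    """Return True if the acquisitions cover (b), (e), and either (f) or (g)."""
    available = set(stages)
    return (
        "b" in available
        and "e" in available
        and bool(available & {"f", "g"})
    )


def missing_stages(stages):
    """List the requirements that the available acquisitions do not meet."""
    available = set(stages)
    missing = [s for s in ("b", "e") if s not in available]
    if not available & {"f", "g"}:
        missing.append("f or g")
    return missing


# The two-acquisition case from the text: only an (a) and an (f) stage.
print(has_required_stages(["a", "f"]))       # False
print(missing_stages(["a", "f"]))            # ['b', 'e']
print(has_required_stages(["b", "e", "g"]))  # True
```

A check of this kind could flag "shortchanged" segments before aggregation,
although the report does not say how such screening was or would be done.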
TABLE 1.- COMPARISON OF GROWTH STAGE
AVAILABILITY TO LABELING ERROR
An overall view of table 1 clearly demonstrates that the labeling error rate
is reduced as the available growth stages increase. The bottom of the list,
where the least percentage of labeling error occurs, has most of the
"available growth stage represented" column filled, in contrast to the higher
error segments, where the "growth stages not represented" column is filled
more.

Two segments had most of their labeling error caused by the a, insufficient
acquisition, error. A very high error rate is evident when this condition
occurs. Both segments 1365 and 1604 are at the top of the list on table 1.

It would be reasonable to conclude from the results of table 1 that the
availability of turning to harvest growth stages for labeling contributes to
lower labeling error. The higher error rate is associated with the
unavailability of turning to harvest growth stages. The following table shows
the omission labeling error rate (Type 2 dots only) between the segments with
and without postheading acquisitions.


  With postheading acquisitions        Without postheading acquisitions

With b, e        Without b, e        With b, e        Without b, e
acquisitions     acquisitions        acquisitions     acquisitions

   16.6%            23.8%               27.6%            27.5%


Number of segments per category 


The minimum set of growth stages that should be available for the optimum
collection of acquisitions is early emergence (b), jointing to heading (e),
and turning (f).
6.3.1 BORDER/EDGE

One should not judge the results of table 1 as being totally caused by missing
acquisitions (growth stages) because other causes also influence the results,
such as border/edge (tt) and small grains that do not follow the temporal
color sequence (6).

6.3.2 UNDERESTIMATION 

Misidentification of small-grain signatures, which are omission errors, was
one of the major sources of underestimation of the classification estimates
during Phase III. The misidentification of nonsmall-grain signatures, which
are commission errors, causes overestimation and comprises a relatively small
percentage of the labeling error. The following table shows the omission and
commission errors for all the Type 2 dots in the five states.

                      Omission                   Commission

State           No. error   No. pixels     No. error   No. pixels
                pixels      labeled        pixels      labeled

North Dakota       114         455             30         563
Oklahoma            77         318             43         440
Montana             38         297             17         498
Minnesota           32         145              9         206
Colorado            24         114              3         286

Total              285        1329            102        1993

                21.4% omission error       5.1% commission error


In the five states investigated, the omission error was 21.4 percent
(285 ÷ 1329) and the commission error was 5.1 percent (102 ÷ 1993). The data
showed that the interpretation tended to be conservative. The commission
error was low throughout the LACIE program - approximately 2 to 5 percent.
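
The five-state rates quoted above follow directly from the table counts. The
sketch below simply redoes that arithmetic; the counts are those in the
table, while the variable and function names are assumptions introduced for
illustration.

```python
# Type 2 dot counts from the table in this section, per state:
# (omission error pixels, pixels labeled for omission,
#  commission error pixels, pixels labeled for commission)
COUNTS = {
    "North Dakota": (114, 455, 30, 563),
    "Oklahoma": (77, 318, 43, 440),
    "Montana": (38, 297, 17, 498),
    "Minnesota": (32, 145, 9, 206),
    "Colorado": (24, 114, 3, 286),
}


def error_rate(errors, labeled):
    """Error rate in percent: error pixels divided by pixels labeled."""
    return 100.0 * errors / labeled


def totals(counts):
    """Sum each of the four count columns over all states."""
    return tuple(sum(column) for column in zip(*counts.values()))


om_err, om_lab, com_err, com_lab = totals(COUNTS)
omission_pct = error_rate(om_err, om_lab)
commission_pct = error_rate(com_err, com_lab)

print(f"Omission:   {om_err}/{om_lab} = {omission_pct:.1f}%")     # 21.4%
print(f"Commission: {com_err}/{com_lab} = {commission_pct:.1f}%")  # 5.1%
```

Applying `error_rate` to a single row of `COUNTS` gives the per-state rates in
the same way.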
6.4 LARGEST ERROR CAUSES

A tabulation of the labeling accuracies and error causes for the Type 2 dots
is presented in table 2. The labeling accuracies are given for both omission
and commission errors in each state. The single segments from North Dakota
and Oklahoma with high error due to insufficient acquisitions were excluded.
Segments with high error due to strip/fallow fields from Montana were also
excluded.

6.4.1 OMISSION ERRORS 

The omission accuracies (OA) were calculated by:

\[
\mathrm{OA} = \frac{\text{Total number of correctly labeled small-grain dots}}
                   {\text{Total number of labeled ground-truth small-grain dots}}
              \times 100
\]

The commission accuracies (CA) were calculated by:

\[
\mathrm{CA} = \frac{\text{Total number of correctly labeled nonsmall-grain dots}}
                   {\text{Total number of labeled ground-truth nonsmall-grain dots}}
              \times 100
\]

The causes of the labeling error are given separately for the omission and
commission errors in the table. To make the omission and commission error
rates comparable between states, the errors have been averaged by dividing
the number of error pixels per cause by the total number of labeled pixels at
the state level.
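
The accuracy formulas and the per-cause averaging can be sketched as follows.
This is an illustration under stated assumptions, not the procedure used to
build table 2: the function names are hypothetical, the totals are the
five-state Type 2 counts from section 6.3.2 (which include segments that
table 2 excludes), and the per-cause counts are invented placeholders.

```python
def accuracy(correct, labeled):
    """Accuracy in percent: correctly labeled dots over labeled ground-truth dots."""
    return 100.0 * correct / labeled


def cause_rates(error_pixels_by_cause, labeled):
    """Percent of labeled pixels attributed to each error cause."""
    return {
        cause: 100.0 * count / labeled
        for cause, count in error_pixels_by_cause.items()
    }


# Five-state Type 2 totals from section 6.3.2, used only as sample input.
small_grain_labeled, small_grain_errors = 1329, 285
nonsmall_labeled, nonsmall_errors = 1993, 102

oa = accuracy(small_grain_labeled - small_grain_errors, small_grain_labeled)
ca = accuracy(nonsmall_labeled - nonsmall_errors, nonsmall_labeled)
print(f"OA = {oa:.1f}%, CA = {ca:.1f}%")  # OA = 78.6%, CA = 94.9%

# HYPOTHETICAL per-cause counts for one state; not taken from table 2.
example_causes = {"border/edge": 10, "behind ACC": 5}
print(cause_rates(example_causes, labeled=200))
# {'border/edge': 5.0, 'behind ACC': 2.5}
```

Dividing by the number of labeled pixels rather than by the number of error
pixels is what lets a cause be compared across states that had different
numbers of labeled dots.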

The results of the causes of labeling error in table 2 show that 85 percent
of the error was due to the following causes (in descending order of the
amount of error):

• Border/edge pixels.

• Small-grain signature that is significantly behind the temporal color
  sequence of the majority of the small-grain signatures.

• The acquisitions available, which provided an insufficient representation
  of the crop growth stages needed for discrimination of the signatures.

• A small-grain signature that did not follow the temporal sequence of the
  small grain.
TABLE 2.- EVALUATION OF LABELING ACCURACIES OF TYPE 2 DOTS OF PHASE III
BLIND SITES IN FIVE U.S.G.P. STATES
6.4.2 COMMISSION ERRORS

Two types of crops were more repeatedly labeled small grain as confusion
crops: grass and idle fallow. Both of these crop or land-use types occurred
more frequently than the others in all five states.

In Oklahoma, the abandoned wheat cause was high, mainly due to the lack of
acquisitions in the jointing-to-heading stage, which precluded the analyst
from determining the difference between the fields that were abandoned and
those that were in the turning stage.

6.4.3 GROUND-TRUTH ACCURACY 

The discrepancy in the error rates between the digital ground truth and the
labeling error characterization was measurable. The differences are caused,
primarily, by the local misregistration of pixels, as described in section 4.3.
The digital ground truth should be used with caution for determining the
accuracy of classification estimates. The differences between the two are
shown below.



                                    Error rate for all segments, %

Ground-truth type                     Omission       Commission

Digital ground truth                    28.4             8.9
Labeling error characterization         21.4             5.1

Difference                               7.0             3.8



These differences amount to about 33 percent of the labeling error
characterization omission rate (7.0 ÷ 21.4) and 42.7 percent of the digital
ground truth commission rate (3.8 ÷ 8.9).
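
Because a relative difference depends on which rate is taken as the base, it
can help to compute it both ways. The sketch below uses the four error rates
from the table above; the dictionary layout and function names are
assumptions for illustration only.

```python
# Error rates in percent, from the table above.
RATES = {
    "omission": {"digital": 28.4, "characterization": 21.4},
    "commission": {"digital": 8.9, "characterization": 5.1},
}


def difference(kind):
    """Digital ground truth rate minus labeling error characterization rate."""
    rates = RATES[kind]
    return rates["digital"] - rates["characterization"]


def percent_of(kind, base):
    """The difference expressed as a percentage of the chosen base rate."""
    return 100.0 * difference(kind) / RATES[kind][base]


for kind in RATES:
    print(
        kind,
        round(difference(kind), 1),
        round(percent_of(kind, "characterization"), 1),
        round(percent_of(kind, "digital"), 1),
    )
# omission 7.0 32.7 24.6
# commission 3.8 74.5 42.7
```

The 33-percent omission figure corresponds to the labeling error
characterization base, while the 42.7-percent commission figure corresponds
to the digital ground truth base.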


It should be clearly understood that the labeling error characterization only
evaluated those pixels of the 209-dot matrix that were labeled by the analyst.
Although the remainder of the 209 dots were not evaluated, it would seem likely
that the discrepancy would apply to these others also.
7. CONCLUSIONS AND RECOMMENDATIONS

The Phase III labeling error characterization study shows that -

• The results of this evaluation for the states of Minnesota and Colorado
are probably too meager to be conclusive. The addition of segments to the
evaluation for these states would make the results more meaningful.

• Segments without one of the acquisitions representing early emergence,
jointing to heading, and either turning or harvest had higher omission
error rates. Segments with this condition should be excluded from
the final classification estimate submitted for aggregation.

• Mislabeling of the strip/fallow field areas produced an equal amount of
small grain and nonsmall grain. In areas of strip/fallow, mislabeling
therefore did not contribute to the underestimation problem.

• Border/edge pixels caused the greatest amount of omission errors. If these
pixels could be labeled by some method other than by analyst interpretation,
the underestimation caused by the border/edge error might be reduced.
Perhaps the analyst would only identify the pixel as border/edge; then some
simple procedure or a statistical manipulation by the computer would be
useful.

• The analysts basically did a fine job of labeling in Phase III. The
omission error rate was 21.4 percent, and the commission rate was
5.1 percent. The major portion of the underestimation (omission error) was
caused by factors beyond the control of the analyst following the
interpretation procedures, as shown below.

• 85 percent of the total omission error for the five states, in descending
order, was due to border/edge pixels (k), to small-grain signatures that
were significantly behind the temporal color sequence of the majority of
the small-grain signatures (Y1), to the acquisitions available that provided
an insufficient representation of the crop growth stages needed for
discrimination of the signatures, and to small-grain signatures that did not
follow the temporal color sequence of the small grain.
The analyst will probably always have a conservative bias toward any target
crop because he must have consistent evidence to support the existence of
the target crop. Otherwise, where the evidence is vague and the analyst must
guess between two choices, the target crop will be underestimated.
APPENDIX


TABULATION SHEETS ON NORTH DAKOTA, OKLAHOMA, 
MONTANA, MINNESOTA, AND COLORADO 


The raw data used for tabulating labeling errors of Type 1 and Type 2 dots for
selected segments in North Dakota (18), Oklahoma (11), Montana (10),
Minnesota (6), and Colorado (6) are presented in tables A-1 to A-10.
TABLE A-1.- Concluded.

TABLE A-2.- NORTH DAKOTA PHASE III BLIND SITE DATA - TYPE 1 DOTS
TABLE A-2.- Continued

TABLE A-2.- Concluded

TABLE A-3.- OKLAHOMA PHASE III BLIND SITE DATA - TYPE 2 DOTS
TABLE A-3.- Continued

TABLE A-9.- COLORADO PHASE III BLIND SITE DATA - TYPE 2 DOTS

TABLE A-10.- COLORADO PHASE III BLIND SITE DATA - TYPE 1 DOTS

TABLE A-10.- Concluded.

fields procedure; no data.